Mask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments

Authors

Heikki Kallasjoki

Sami Keronen

Guy J. Brown

Jort F. Gemmeke

Ulpu Remes

Kalle J. Palomäki

Abstract

This work presents an automatic speech recognition system that uses a missing data approach to compensate for environmental noise. The missing, noise-corrupted components are identified using binaural features or a support vector machine (SVM) classifier. To perform speech recognition using the partially observed data, the missing components are substituted with clean speech estimates calculated using sparse imputation. Evaluated on the CHiME reverberant multisource environment corpus, the missing data approach significantly improved the keyword recognition accuracy in moderate and poor SNR conditions. The best results were achieved when the missing components were identified using the binaural features and the clean speech estimates were accompanied by observation uncertainty estimates.
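
The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical dictionary of clean speech exemplars (one atom per column), a binary mask that is already estimated (1 = reliable, 0 = missing), and a simple non-negative proximal gradient solver standing in for whichever sparse solver the paper uses. The cap of each imputed value at the observed noisy value is a further assumption, borrowed from bounded imputation on compressed spectral features. All function and parameter names are illustrative.

```python
import numpy as np


def sparse_impute(y, mask, dictionary, l1_weight=0.01, n_iter=2000):
    """Replace unreliable components of y with sparse-coded estimates.

    y          : (F,) noisy feature vector
    mask       : (F,) 1 = reliable, 0 = missing (assumed to be given)
    dictionary : (F, K) clean speech exemplars, hypothetical
    """
    reliable = np.asarray(mask).astype(bool)
    if not reliable.any():
        raise ValueError("at least one reliable component is required")

    # Fit the sparse code on the reliable rows only.
    a_rel = dictionary[reliable]
    y_rel = y[reliable]

    # Step size from the Lipschitz constant of the least-squares gradient.
    lipschitz = np.linalg.norm(a_rel, 2) ** 2
    step = 1.0 / max(lipschitz, 1e-12)

    # Non-negative proximal gradient on 0.5*||a_rel x - y_rel||^2 + l1_weight*sum(x).
    x = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        grad = a_rel.T @ (a_rel @ x - y_rel)
        x = np.maximum(x - step * (grad + l1_weight), 0.0)

    # Reconstruct every component from the full dictionary.
    estimate = dictionary @ x

    # Keep reliable observations; impute the rest, capped at the observation
    # (assumption: the noisy value is an upper bound on the clean value).
    imputed = y.astype(float).copy()
    imputed[~reliable] = np.minimum(estimate[~reliable], y[~reliable])
    return imputed, x
```

Calling `sparse_impute(noisy, mask, exemplars)` returns the imputed feature vector together with the activation weights. The sketch handles a single feature vector; stacking several frames into one vector is one way to give the solver more reliable components to fit.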

Similar resources

Mask estimation and imputation methods for missing data speech recognition in a multisource reverberant environment

We present an automatic speech recognition system that uses a missing data approach to compensate for challenging environmental noise containing both additive and convolutive components. The unreliable and noise-corrupted (“missing”) components are identified using a Gaussian mixture model (GMM) classifier based on a diverse range of acoustic features. To perform speech recognition using the par...

Mask estimation in non-stationary noise environments for missing feature based robust speech recognition

In missing feature based automatic speech recognition (ASR), the role of the spectro-temporal mask in providing an accurate description of the relationship between target speech and environmental noise is critical for minimizing the degradation in ASR word accuracy (WAC) as the signal-to-noise ratio (SNR) decreases. This paper demonstrates the importance of accurate characterization of instanta...

Observation uncertainty measures for sparse imputation

Missing data imputation estimates the clean speech features for automatic speech recognition in noisy environments. The estimates are usually considered equally reliable while in reality, the estimation accuracy varies from feature to feature. In this work, we propose uncertainty measures to characterise the expected accuracy of a sparse imputation (SI) based missing data method. In experiments...

Separating Speech From Noise Challenge

We have used the data from the PASCAL CHiME challenge with the goal of training a Support Vector Machine (SVM) to estimate a noise mask that labels time-frames/frequency-bins of the audio as 'reliable' or 'unreliable'. This noise mask could be used by another block in the signal processing pipeline to treat the unreliable data as missing and then replace the missing data with an estimate of the...

Noise robust digit recognition using sparse representations

Despite the use of noise robustness techniques, automatic speech recognition (ASR) systems make many more recognition errors than humans, especially in very noisy circumstances. We argue that this inferior recognition performance is largely due to the fact that in ASR speech is typically processed on a frame-by-frame basis preventing the redundancy in the speech signal to be optimally exploited...

Publication date: 2011
